Summarizing Linked Data RDF Graphs Using Approximate Graph Pattern Mining
Abstract
The Linked Open Data (LOD) cloud brings together information described in RDF and stored on the web in (possibly distributed) RDF Knowledge Bases (KBs). The data in these KBs are not necessarily described by a known schema, and it is often extremely time-consuming to query all the interlinked KBs in order to acquire the necessary information. To tackle this problem, we propose a method of summarizing large RDF KBs using approximate RDF graph patterns and calculating the number of instances covered by each pattern. We then transform the patterns into an RDF schema that describes the contents of the KB. The resulting RDF graph summary can be queried to identify whether the necessary information is present and, if so, how much of it there is, before deciding to include the KB in a federated query.
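The abstract does not spell out the mining algorithm, so the following is only a minimal sketch of the general idea of pattern-based summarization, not the authors' method. It assumes a simplified notion of "pattern" (the set of `rdf:type` classes plus the set of properties a subject uses) and groups subjects exactly rather than approximately. The names `summarize`, `instances_with`, and the sample triples are hypothetical.

```python
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy knowledge base as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:bob", "foaf:knows", "ex:alice"),
    ("ex:carol", RDF_TYPE, "foaf:Person"),
    ("ex:carol", "foaf:name", "Carol"),
    ("ex:doc1", RDF_TYPE, "foaf:Document"),
    ("ex:doc1", "dc:title", "Report"),
]


def summarize(triples):
    """Map each (classes, properties) pattern to the number of subjects matching it.

    Assumption: a pattern is the exact set of classes and properties of a subject.
    An approximate miner would additionally merge near-identical patterns.
    """
    types = defaultdict(set)
    props = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
        else:
            props[s].add(p)
    subjects = set(types) | set(props)
    return Counter((frozenset(types[s]), frozenset(props[s])) for s in subjects)


def instances_with(summary, rdf_class, predicate):
    """Count instances of rdf_class that use predicate, using only the summary."""
    return sum(
        count
        for (classes, predicates), count in summary.items()
        if rdf_class in classes and predicate in predicates
    )


if __name__ == "__main__":
    summary = summarize(TRIPLES)
    for (classes, predicates), count in summary.items():
        print(sorted(classes), sorted(predicates), count)
    print(instances_with(summary, "foaf:Person", "foaf:knows"))
```

Answering a question such as "does this KB contain persons with `foaf:knows` links, and how many?" then touches only the small summary, which is the kind of check the abstract suggests running before a KB is included in a federated query.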
Similar resources
GRAPHIUM: Visualizing Performance of Graph and RDF Engines on Linked Data
Graph size, density, and number of labels negatively affect the performance of all the engines. Graph summarization seems to be more affected by the graph density and the number of labels. Dense graph is more influenced by the size of the graphs. RDF-3X outperforms the rest of the engines in pattern matching and graph creation. DEX seems to overcome the rest of the engines when the graphs ar...
RDF2Vec: RDF Graph Embeddings for Data Mining
Linked Open Data has been recognized as a valuable source for background information in data mining. However, most data mining tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that uses language modeling approaches for unsu...
RDF2Vec: RDF Graph Embeddings and Their Applications
Linked Open Data has been recognized as a valuable source for background information in many data mining and information retrieval tasks. However, most of the existing tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that u...
Exploring Linked Data Graph Structures
The true value of Linked Data becomes apparent when datasets are analyzed and understood at the basic level of data types, constraints, value patterns, etc. Such data profiling is especially challenging for RDF data, the underlying data model on the Web of Data. In particular, graph analysis can be used to gain more insight into the data, induce schemas, or build indices. We present ProL...
Semantic Web Mining using RDF Data
Information on the web is increasing every minute, and redundancy in that information is growing rapidly. Data mining is the technique used to extract this data as per the user’s query. Technically, data mining is the process of analyzing data and summarizing it into useful information. Keyword search is an important tool for exploring and searching large data corpora whose structure is either unknown or constantly changing....
Publication date: 2016